Integration of ChIP-seq and machine learning reveals enhancers and a predictive regulatory sequence vocabulary in melanocytes.
Internal identifier: 002245 (Main/Exploration); previous: 002244; next: 002246
Authors: David U. Gorkin [United States]; Dongwon Lee; Xylena Reed; Christopher Fletez-Brant; Seneca L. Bessling; Stacie K. Loftus; Michael A. Beer; William J. Pavan; Andrew S. Mccallion
Source:
- Genome research [ 1549-5469 ] ; 2012.
French descriptors
- KwdFr :
- Algorithmes, Analyse de séquence d'ADN (), Animaux, Danio zébré, Facteurs de transcription (métabolisme), Gènes rapporteurs, Génome humain, Histone (métabolisme), Humains, Immunoprécipitation de la chromatine (), Intelligence artificielle, Mélanocytes (métabolisme), Protéine p300-E1A (génétique), Protéine p300-E1A (métabolisme), Régulation de l'expression des gènes, Souris, Éléments activateurs (génétique), Évolution moléculaire.
- MESH :
- génétique : Protéine p300-E1A.
- métabolisme : Facteurs de transcription, Histone, Mélanocytes, Protéine p300-E1A.
- Algorithmes, Analyse de séquence d'ADN, Animaux, Danio zébré, Gènes rapporteurs, Génome humain, Humains, Immunoprécipitation de la chromatine, Intelligence artificielle, Régulation de l'expression des gènes, Souris, Éléments activateurs (génétique), Évolution moléculaire.
English descriptors
- KwdEn :
- Algorithms, Animals, Artificial Intelligence, Chromatin Immunoprecipitation (methods), E1A-Associated p300 Protein (genetics), E1A-Associated p300 Protein (metabolism), Enhancer Elements, Genetic, Evolution, Molecular, Gene Expression Regulation, Genes, Reporter, Genome, Human, Histones (metabolism), Humans, Melanocytes (metabolism), Mice, Sequence Analysis, DNA (methods), Transcription Factors (metabolism), Zebrafish.
- MESH :
- chemical, genetics: E1A-Associated p300 Protein.
- chemical, metabolism: E1A-Associated p300 Protein, Histones, Transcription Factors.
- metabolism : Melanocytes.
- methods : Chromatin Immunoprecipitation, Sequence Analysis, DNA.
- Algorithms, Animals, Artificial Intelligence, Enhancer Elements, Genetic, Evolution, Molecular, Gene Expression Regulation, Genes, Reporter, Genome, Human, Humans, Mice, Zebrafish.
Abstract
We take a comprehensive approach to the study of regulatory control of gene expression in melanocytes that proceeds from large-scale enhancer discovery facilitated by ChIP-seq; to rigorous validation in silico, in vitro, and in vivo; and finally to the use of machine learning to elucidate a regulatory vocabulary with genome-wide predictive power. We identify 2489 putative melanocyte enhancer loci in the mouse genome by ChIP-seq for EP300 and H3K4me1. We demonstrate that these putative enhancers are evolutionarily constrained, enriched for sequence motifs predicted to bind key melanocyte transcription factors, located near genes relevant to melanocyte biology, and capable of driving reporter gene expression in melanocytes in culture (86%; 43/50) and in transgenic zebrafish (70%; 7/10). Next, using the sequences of these putative enhancers as a training set for a supervised machine learning algorithm, we develop a vocabulary of 6-mers predictive of melanocyte enhancer function. Lastly, we demonstrate that this vocabulary has genome-wide predictive power in both the mouse and human genomes. This study provides deep insight into the regulation of gene expression in melanocytes and demonstrates a powerful approach to the investigation of regulatory sequences that can be applied to other cell types.
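The abstract describes a vocabulary of 6-mers learned from enhancer sequences. As a rough illustration of what a 6-mer feature representation can look like, the sketch below counts 6-mers in a DNA sequence and turns the counts into a normalized feature vector. It is not the authors' implementation: the function names, the choice to merge each 6-mer with its reverse complement, and the frequency normalization are all assumptions made here for illustration, and the classifier that would consume these features is left out.

```python
from collections import Counter
from itertools import product

# Translation table for complementing A/C/G/T.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(kmer):
    """Merge a k-mer with its reverse complement (assumption, not from the
    abstract) by keeping the lexicographically smaller of the two."""
    return min(kmer, reverse_complement(kmer))


def kmer_counts(seq, k=6):
    """Count canonical k-mers, skipping windows with non-ACGT characters."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[canonical(kmer)] += 1
    return counts


def feature_vector(seq, k=6):
    """Return (vocabulary, frequencies) over all canonical k-mers."""
    vocab = sorted({canonical("".join(p)) for p in product("ACGT", repeat=k)})
    counts = kmer_counts(seq, k)
    total = sum(counts.values()) or 1  # avoid division by zero on short input
    return vocab, [counts[w] / total for w in vocab]


# Hypothetical toy sequence, not taken from the study.
vocab, freqs = feature_vector("ACGTCATGTGACCANNTCACATGA")
print(len(vocab), sum(1 for f in freqs if f > 0))
```

Merging reverse complements reflects the fact that an enhancer sequence can be read from either strand; a supervised classifier could then be trained on such vectors for enhancer and background sequences.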
DOI: 10.1101/gr.139360.112
PubMed: 23019145
Affiliations:
Links toward previous steps (curation, corpus...)
- to stream PubMed, to step Corpus: 001D44
- to stream PubMed, to step Curation: 001D44
- to stream PubMed, to step Checkpoint: 001C71
- to stream Ncbi, to step Merge: 000999
- to stream Ncbi, to step Curation: 000999
- to stream Ncbi, to step Checkpoint: 000999
- to stream Main, to step Merge: 002270
- to stream Main, to step Curation: 002245
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en">Integration of ChIP-seq and machine learning reveals enhancers and a predictive regulatory sequence vocabulary in melanocytes.</title>
<author><name sortKey="Gorkin, David U" sort="Gorkin, David U" uniqKey="Gorkin D" first="David U" last="Gorkin">David U. Gorkin</name>
<affiliation wicri:level="2"><nlm:affiliation>McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205</wicri:regionArea>
<placeName><region type="state">Maryland</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Lee, Dongwon" sort="Lee, Dongwon" uniqKey="Lee D" first="Dongwon" last="Lee">Dongwon Lee</name>
</author>
<author><name sortKey="Reed, Xylena" sort="Reed, Xylena" uniqKey="Reed X" first="Xylena" last="Reed">Xylena Reed</name>
</author>
<author><name sortKey="Fletez Brant, Christopher" sort="Fletez Brant, Christopher" uniqKey="Fletez Brant C" first="Christopher" last="Fletez-Brant">Christopher Fletez-Brant</name>
</author>
<author><name sortKey="Bessling, Seneca L" sort="Bessling, Seneca L" uniqKey="Bessling S" first="Seneca L" last="Bessling">Seneca L. Bessling</name>
</author>
<author><name sortKey="Loftus, Stacie K" sort="Loftus, Stacie K" uniqKey="Loftus S" first="Stacie K" last="Loftus">Stacie K. Loftus</name>
</author>
<author><name sortKey="Beer, Michael A" sort="Beer, Michael A" uniqKey="Beer M" first="Michael A" last="Beer">Michael A. Beer</name>
</author>
<author><name sortKey="Pavan, William J" sort="Pavan, William J" uniqKey="Pavan W" first="William J" last="Pavan">William J. Pavan</name>
</author>
<author><name sortKey="Mccallion, Andrew S" sort="Mccallion, Andrew S" uniqKey="Mccallion A" first="Andrew S" last="Mccallion">Andrew S. Mccallion</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">PubMed</idno>
<date when="2012">2012</date>
<idno type="RBID">pubmed:23019145</idno>
<idno type="pmid">23019145</idno>
<idno type="doi">10.1101/gr.139360.112</idno>
<idno type="wicri:Area/PubMed/Corpus">001D44</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">001D44</idno>
<idno type="wicri:Area/PubMed/Curation">001D44</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">001D44</idno>
<idno type="wicri:Area/PubMed/Checkpoint">001C71</idno>
<idno type="wicri:explorRef" wicri:stream="Checkpoint" wicri:step="PubMed">001C71</idno>
<idno type="wicri:Area/Ncbi/Merge">000999</idno>
<idno type="wicri:Area/Ncbi/Curation">000999</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">000999</idno>
<idno type="wicri:Area/Main/Merge">002270</idno>
<idno type="wicri:Area/Main/Curation">002245</idno>
<idno type="wicri:Area/Main/Exploration">002245</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en">Integration of ChIP-seq and machine learning reveals enhancers and a predictive regulatory sequence vocabulary in melanocytes.</title>
<author><name sortKey="Gorkin, David U" sort="Gorkin, David U" uniqKey="Gorkin D" first="David U" last="Gorkin">David U. Gorkin</name>
<affiliation wicri:level="2"><nlm:affiliation>McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA.</nlm:affiliation>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205</wicri:regionArea>
<placeName><region type="state">Maryland</region>
</placeName>
</affiliation>
</author>
<author><name sortKey="Lee, Dongwon" sort="Lee, Dongwon" uniqKey="Lee D" first="Dongwon" last="Lee">Dongwon Lee</name>
</author>
<author><name sortKey="Reed, Xylena" sort="Reed, Xylena" uniqKey="Reed X" first="Xylena" last="Reed">Xylena Reed</name>
</author>
<author><name sortKey="Fletez Brant, Christopher" sort="Fletez Brant, Christopher" uniqKey="Fletez Brant C" first="Christopher" last="Fletez-Brant">Christopher Fletez-Brant</name>
</author>
<author><name sortKey="Bessling, Seneca L" sort="Bessling, Seneca L" uniqKey="Bessling S" first="Seneca L" last="Bessling">Seneca L. Bessling</name>
</author>
<author><name sortKey="Loftus, Stacie K" sort="Loftus, Stacie K" uniqKey="Loftus S" first="Stacie K" last="Loftus">Stacie K. Loftus</name>
</author>
<author><name sortKey="Beer, Michael A" sort="Beer, Michael A" uniqKey="Beer M" first="Michael A" last="Beer">Michael A. Beer</name>
</author>
<author><name sortKey="Pavan, William J" sort="Pavan, William J" uniqKey="Pavan W" first="William J" last="Pavan">William J. Pavan</name>
</author>
<author><name sortKey="Mccallion, Andrew S" sort="Mccallion, Andrew S" uniqKey="Mccallion A" first="Andrew S" last="Mccallion">Andrew S. Mccallion</name>
</author>
</analytic>
<series><title level="j">Genome research</title>
<idno type="eISSN">1549-5469</idno>
<imprint><date when="2012" type="published">2012</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Algorithms</term>
<term>Animals</term>
<term>Artificial Intelligence</term>
<term>Chromatin Immunoprecipitation (methods)</term>
<term>E1A-Associated p300 Protein (genetics)</term>
<term>E1A-Associated p300 Protein (metabolism)</term>
<term>Enhancer Elements, Genetic</term>
<term>Evolution, Molecular</term>
<term>Gene Expression Regulation</term>
<term>Genes, Reporter</term>
<term>Genome, Human</term>
<term>Histones (metabolism)</term>
<term>Humans</term>
<term>Melanocytes (metabolism)</term>
<term>Mice</term>
<term>Sequence Analysis, DNA (methods)</term>
<term>Transcription Factors (metabolism)</term>
<term>Zebrafish</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr"><term>Algorithmes</term>
<term>Analyse de séquence d'ADN ()</term>
<term>Animaux</term>
<term>Danio zébré</term>
<term>Facteurs de transcription (métabolisme)</term>
<term>Gènes rapporteurs</term>
<term>Génome humain</term>
<term>Histone (métabolisme)</term>
<term>Humains</term>
<term>Immunoprécipitation de la chromatine ()</term>
<term>Intelligence artificielle</term>
<term>Mélanocytes (métabolisme)</term>
<term>Protéine p300-E1A (génétique)</term>
<term>Protéine p300-E1A (métabolisme)</term>
<term>Régulation de l'expression des gènes</term>
<term>Souris</term>
<term>Éléments activateurs (génétique)</term>
<term>Évolution moléculaire</term>
</keywords>
<keywords scheme="MESH" type="chemical" qualifier="genetics" xml:lang="en"><term>E1A-Associated p300 Protein</term>
</keywords>
<keywords scheme="MESH" type="chemical" qualifier="metabolism" xml:lang="en"><term>E1A-Associated p300 Protein</term>
<term>Histones</term>
<term>Transcription Factors</term>
</keywords>
<keywords scheme="MESH" qualifier="génétique" xml:lang="fr"><term>Protéine p300-E1A</term>
</keywords>
<keywords scheme="MESH" qualifier="metabolism" xml:lang="en"><term>Melanocytes</term>
</keywords>
<keywords scheme="MESH" qualifier="methods" xml:lang="en"><term>Chromatin Immunoprecipitation</term>
<term>Sequence Analysis, DNA</term>
</keywords>
<keywords scheme="MESH" qualifier="métabolisme" xml:lang="fr"><term>Facteurs de transcription</term>
<term>Histone</term>
<term>Mélanocytes</term>
<term>Protéine p300-E1A</term>
</keywords>
<keywords scheme="MESH" xml:lang="en"><term>Algorithms</term>
<term>Animals</term>
<term>Artificial Intelligence</term>
<term>Enhancer Elements, Genetic</term>
<term>Evolution, Molecular</term>
<term>Gene Expression Regulation</term>
<term>Genes, Reporter</term>
<term>Genome, Human</term>
<term>Humans</term>
<term>Mice</term>
<term>Zebrafish</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr"><term>Algorithmes</term>
<term>Analyse de séquence d'ADN</term>
<term>Animaux</term>
<term>Danio zébré</term>
<term>Gènes rapporteurs</term>
<term>Génome humain</term>
<term>Humains</term>
<term>Immunoprécipitation de la chromatine</term>
<term>Intelligence artificielle</term>
<term>Régulation de l'expression des gènes</term>
<term>Souris</term>
<term>Éléments activateurs (génétique)</term>
<term>Évolution moléculaire</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">We take a comprehensive approach to the study of regulatory control of gene expression in melanocytes that proceeds from large-scale enhancer discovery facilitated by ChIP-seq; to rigorous validation in silico, in vitro, and in vivo; and finally to the use of machine learning to elucidate a regulatory vocabulary with genome-wide predictive power. We identify 2489 putative melanocyte enhancer loci in the mouse genome by ChIP-seq for EP300 and H3K4me1. We demonstrate that these putative enhancers are evolutionarily constrained, enriched for sequence motifs predicted to bind key melanocyte transcription factors, located near genes relevant to melanocyte biology, and capable of driving reporter gene expression in melanocytes in culture (86%; 43/50) and in transgenic zebrafish (70%; 7/10). Next, using the sequences of these putative enhancers as a training set for a supervised machine learning algorithm, we develop a vocabulary of 6-mers predictive of melanocyte enhancer function. Lastly, we demonstrate that this vocabulary has genome-wide predictive power in both the mouse and human genomes. This study provides deep insight into the regulation of gene expression in melanocytes and demonstrates a powerful approach to the investigation of regulatory sequences that can be applied to other cell types.</div>
</front>
</TEI>
<affiliations><list><country><li>États-Unis</li>
</country>
<region><li>Maryland</li>
</region>
</list>
<tree><noCountry><name sortKey="Beer, Michael A" sort="Beer, Michael A" uniqKey="Beer M" first="Michael A" last="Beer">Michael A. Beer</name>
<name sortKey="Bessling, Seneca L" sort="Bessling, Seneca L" uniqKey="Bessling S" first="Seneca L" last="Bessling">Seneca L. Bessling</name>
<name sortKey="Fletez Brant, Christopher" sort="Fletez Brant, Christopher" uniqKey="Fletez Brant C" first="Christopher" last="Fletez-Brant">Christopher Fletez-Brant</name>
<name sortKey="Lee, Dongwon" sort="Lee, Dongwon" uniqKey="Lee D" first="Dongwon" last="Lee">Dongwon Lee</name>
<name sortKey="Loftus, Stacie K" sort="Loftus, Stacie K" uniqKey="Loftus S" first="Stacie K" last="Loftus">Stacie K. Loftus</name>
<name sortKey="Mccallion, Andrew S" sort="Mccallion, Andrew S" uniqKey="Mccallion A" first="Andrew S" last="Mccallion">Andrew S. Mccallion</name>
<name sortKey="Pavan, William J" sort="Pavan, William J" uniqKey="Pavan W" first="William J" last="Pavan">William J. Pavan</name>
<name sortKey="Reed, Xylena" sort="Reed, Xylena" uniqKey="Reed X" first="Xylena" last="Reed">Xylena Reed</name>
</noCountry>
<country name="États-Unis"><region name="Maryland"><name sortKey="Gorkin, David U" sort="Gorkin, David U" uniqKey="Gorkin D" first="David U" last="Gorkin">David U. Gorkin</name>
</region>
</country>
</tree>
</affiliations>
</record>
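The record above uses the `wicri:` and `nlm:` prefixes without declaring them, so a strict XML parser will reject it as it stands. One possible workaround, sketched below, is to wrap the record in an element that binds those prefixes before parsing. The namespace URIs, the `WRAPPER` constant, and the `parse_record` helper are placeholders invented for this example, and `RECORD` is an abbreviated copy of the record rather than the full text.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the record; only a few elements are kept.
RECORD = """<record><TEI><teiHeader><fileDesc><titleStmt>
<title xml:lang="en">Integration of ChIP-seq and machine learning reveals enhancers and a predictive regulatory sequence vocabulary in melanocytes.</title>
<author><name sortKey="Gorkin, David U" first="David U" last="Gorkin">David U. Gorkin</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country></affiliation>
</author>
<author><name sortKey="Lee, Dongwon" first="Dongwon" last="Lee">Dongwon Lee</name></author>
</titleStmt>
<publicationStmt>
<idno type="pmid">23019145</idno>
<idno type="doi">10.1101/gr.139360.112</idno>
</publicationStmt>
</fileDesc></teiHeader></TEI></record>"""

# Placeholder namespace URIs (assumption): the record does not declare any.
WRAPPER = (
    '<wrapper xmlns:wicri="urn:example:wicri" '
    'xmlns:nlm="urn:example:nlm">{}</wrapper>'
)


def parse_record(xml_text):
    """Extract the title, author names, and identifiers from one record."""
    root = ET.fromstring(WRAPPER.format(xml_text))
    title_stmt = root.find(".//titleStmt")
    return {
        "title": title_stmt.findtext("title"),
        "authors": [n.text for n in title_stmt.findall("author/name")],
        # Later idno elements overwrite earlier ones that share a type.
        "ids": {i.get("type"): i.text for i in root.iter("idno")},
    }


info = parse_record(RECORD)
print(info["authors"])
print(info["ids"]["doi"])
```

Restricting the author search to `titleStmt` avoids counting each author twice, since the full record repeats the author list under `sourceDesc`.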
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Sante/explor/MersV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002245 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 002245 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Sante |area= MersV1 |flux= Main |étape= Exploration |type= RBID |clé= pubmed:23019145 |texte= Integration of ChIP-seq and machine learning reveals enhancers and a predictive regulatory sequence vocabulary in melanocytes. }}
To generate wiki pages
HfdIndexSelect -h $EXPLOR_AREA/Data/Main/Exploration/RBID.i -Sk "pubmed:23019145" \
| HfdSelect -Kh $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd \
| NlmPubMed2Wicri -a MersV1
This area was generated with Dilib version V0.6.33.